A topological data analysis-based method for gait signals with an application to the study of multiple sclerosis

In the past few years, light, affordable wearable inertial measurement units have been providing to clinicians and researchers the possibility to quantitatively study motor degeneracy by comparing gait trials from patients and/or healthy subjects. To do so, standard gait features can be used but they fail to detect subtle changes in several pathologies including multiple sclerosis. Multiple sclerosis is a demyelinating disease of the central nervous system whose symptoms include lower limb impairment, which is why gait trials are commonly used by clinicians for their patients’ follow-up. This article describes a method to compare pairs of gait signals, visualize the results and interpret them, based on topological data analysis techniques. Our method is non-parametric and requires no data other than gait signals acquired with inertial measurement units. We introduce tools from topological data analysis (sublevel sets, persistence barcodes) in a practical way to make it as accessible as possible in order to encourage its use by clinicians. We apply our method to study a cohort of patients suffering from progressive multiple sclerosis and healthy subjects. We show that it can help estimate the severity of the disease and also be used for longitudinal follow-up to detect an evolution of the disease or other phenomena such as asymmetry or outliers.


Introduction
Longitudinal follow-up and inter-individual comparison of gait trials are relevant for patients suffering with many degenerative diseases [1]. One example of such a disease is progressive Multiple Sclerosis (MS), for which gait is considered the most important source of disability [2]. Throughout this article, we will use MS as an example to illustrate our approach. Those intra/inter-individual comparisons are usually performed using semi-quantitative clinical scales, but those have limitations. In the case of MS, several clinical scales exist such as the Expanded Disability Status Scale (EDSS) [3], the Multiple Sclerosis Walking Scale-12 [4], and the Fatigue Impact Scale [5,6]. In this study, severity of the disease was evaluated using the a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 EDSS, which is a score from 0 to 10, ranging from normal neurological examination (0) to total impotence (9.5) or even death (10) by increments of 0.5. Those scales provide semi-quantitative or qualitative criteria for disease follow-up but they have been criticized for several reasons, including a lack of objectivity and sensitivity to clinical evolution [7][8][9][10][11][12]. This motivates the use of quantitative methods.
Over the past few years, gait quantification was made easier by the development of light, affordable inertial measurement units (IMUs). IMUs are portable systems integrating accelerometers, gyroscopes, and magnetometers that allow the synchronized measurement of linear accelerations and angular velocities in one single light, low-cost device [13]. Standard features such as velocity, step time or step length can be extracted from IMU signals. They have been used to discriminate healthy subjects from patients, or groups of patients with different levels of disease severity, but those studies rely on long protocols (walking for several minutes to get a high number of steps) [14][15][16] and/or gait event detection [14,15,[17][18][19][20]. Long protocols are incompatible with patients with severely altered gait who have trouble walking a few meters, and are more difficult to include in clinical day-to-day practice. Gait event detection is either performed using expensive equipment (pressure sensitive mats, motion capture) or complex algorithms. However, those complex algorithms are difficult to apply to gait signals from pathological subjects with severely altered steps. For example, in step detection, the error on the detected start/end time of steps is typically around 10 ms for healthy subjects (HS) [21] and around 100 ms for severely affected MS patients [22]. Moreover, the above studies do not perform comparisons between different trials of the same subjects at different dates, in which case changes can be more subtle depending on the progression of the disease. This raises the question of how to compare gait trials, especially when some of them are from pathological subjects.
The goals of this article are to present a method to compare pairs of gait trials, visualize the results of all the comparisons and interpret them. Our approach is based on Topological Data Analysis (TDA), which we use to define a distance between gait signals, allowing us to compare gait trials. We then use a visualization algorithm to represent each trial as a 2D-point and compute features to study the structure of the obtained point cloud. By dividing the point cloud into different groups, our method makes it possible to perform both global studies to find differences in gait for different levels of severity of the disease, and longitudinal studies about the evolution of patients' gait in time. TDA is a set of techniques derived from algebraic topology, which allows to analyze the structure of data by looking at it at different scales, and to describe the evolution of their arrangement. (see [23,24] for a detailed introduction). The main idea behind TDA is that data are a finite subset of samples of an underlying mathematical set, whose structure can bring useful information about the system under study. For instance, a gait signal is represented by a time series, i.e. a uniform sampling of a continuous 1-dimensional physiological signal. In this setting, one of the main TDA techniques, so-called persistent homology, can be used to study the underlying continuous signal through the finite time series. TDA has been applied to time series in medicine and biology since the 2010s. Applications include the study of cardiac arrythmia with electrocardiograms [25], motor learning with fMRI data [26,27], gene expression time series [28], wheeze in breathing signals [29], epileptic seizures with electroencephalograms [30], the spread of COVID-19 [31] and autism spectrum disorder [32].
TDA has been applied to the study of locomotion through time series. Motion capture data has been analyzed using TDA to model bipedal walking [33] or to perform action recognition (classification between dance, jump, run sit and walk) [34]. It has also been used to study degenerative diseases by performing binary classification of time series of gait parameters (stride, stance, and swing time) between healthy and pathological subjects (suffering from either Parkinson's disease, Huntington's disease or Amyotrophic lateral sclerosis) [35,36], multi-class classification of ground reaction force time series to assess the severity of Parkinson's disease [37], or detection of freezing-of-gait episodes [38]. To the best of our knowledge, TDA was mainly used to produce features that were fed to machine learning algorithms (such as SVM, random forest, nearest neighbors or deep neural networks) [25,32,[34][35][36][37][38][39]. Topological features increased their performance, but they are more difficult to interpret than traditional ones (such as, for the study of locomotion, speed, step length, step time etc. . .) so the interpretability of the methods used in those articles is not studied. In this article, we propose an interpretable TDA-based method to compare gait trials. More precisely, we use objects from TDA to represent gait trials as points in a space in which a distance can be defined. This distance can be interpreted in terms of signal oscillations and used to compare gait trials. We applied our method to study a cohort of healthy and pathological subjects as a whole, and performed both inter-individual and intra-individual comparisons. In addition, the method has the advantage of working for time series measured with light, affordable IMUs during a protocol used in clinicians' day to day practice.
In the first section, we describe the protocol applied to construct our dataset, introduce the method and its applications, and describe the mathematical concepts required to understand it. In the second section, we present the results of the application of our method to study MS. In the third section, we analyze and discuss those results.

Protocol and data
Our dataset is composed of gait trials from 22 MS patients and 10 young HS. The studies involving human participants were reviewed and approved by Protection des Personnes Nord Ouest III (ID RCB: 2017-A01538-45). The patients/participants provided their written informed consent to participate in this study. The protocol is a walking exercise consisting in a 12m walk with a U-turn while wearing 4 XSens 1 sensors (XSens 1 Technologies, Enschede, the Netherlands; autonomy 6 h, device dimension 47 × 30 × 13 mm, weigth 16 g, acceleration range ±160 m/s 2 , angular velocity range ±2000 deg/s, dynamic accuracy roll/pitch 0.75 deg RMS, dynamic accuracy heading 1.5 deg RMS): one on the dorsal part of each foot (left foot: LF, right foot: RF), one on the lower back (T) and one on the head (H), fixed using a Velcro band designed by XSens 1 . Additional measurements including average walking speed were done for each trial using information from a GaitRite 1 mat, which detects the initial and final contacts of the feet on the ground. The experiment was conducted in two sessions, 6 months apart, that will be referred to as M0 and M6. During each session, the protocol was performed twice. The part of the signals corresponding to the U-turn was automatically removed (using the GaitRite 1 data, as the U-turn happened outside the mat) so that each exercise gives two signals: one for the forward path and one for the return. To sum up, for each IMU (LF, RF, T, or H) of each subject, there are 8 trials: 4 for M0 (F1: forward 1, R1: return 1, F2: forward 2, R2: return 2) and 4 for M6 (F3, R3, F4, R4).
XSens 1 sensors are inertial measurement units (IMUs) that measure the 3D accelerations, 3D angular velocities and 3D magnetic fields. The axes are defined on Fig 1: the Y-axis is parallel to the ground and orthogonal to the walking direction. The data were sampled at 100Hz. For our study, we used the angular velocity around the Y-axis (Gyr-Y) from the feet IMUs (LF, RF), which provides signals suitable for gait assessment as it corresponds to the rotation of the foot around the medio-lateral axis. [21,[40][41][42][43]. In what follows, we will refer to those Gyr-Y signals as gait signals.
All the subjects were recruited between June and September 2018 at Percy Hospital (Clamart, France). The characteristics of the subjects are displayed in Table 1. Seven out of the 22 participants had an advanced disease requiring permanent walking aid (cane(s), walker and/or human help). Two patients needed human help to perform the walking test. Included participants in the MS group had an EDSS between 2 and 6.5, as disabilities greater than 7 completely impede walking. Fig 2 shows an example of gait signals for both feet of a healthy subject. Gait signals can be described as a succession of gait cycles, which are composed of a support phase (when the foot touches the ground) and an oscillation phase (when it is off the ground) [21]. The support phase starts with the heel strike (when the foot hits the ground) and ends with the toe off (when the foot leaves the ground) as shown in Fig 3. The plateau around 0 for angular velocity corresponds to the phase between the foot flat and the heel off events. During the oscillation phase, the angular velocity goes up, stays almost constant and then decreases until the next heel strike.

Overview of our method and guidelines for the clinician
Here, we give an informal description of the objects used in our method and explain how we use them, followed by guidelines on how clinicians can use our method. The mathematical concepts corresponding to terms in bold will be described in the following sections.
Overview of our method. The goal of our method is to produce a quantitative analysis of a database of gait signals, by using comparisons based on topological properties. Given a  database of gait signals and a partition of those signals into groups, it outputs a 2D point cloud and a list of features for each group of the partition. We start by constructing a topological summary of each signal called a persistence barcode. A persistence barcode is a set of bars that represents the oscillations of a signal, where a long bar corresponds to a large variation. Each signal is represented by its barcode. There exists a notion of distance between barcodes called the bottleneck distance, that we use to compare

PLOS ONE
pairs of signals via their barcodes. Before computing the bottleneck distance, we remove the longest bars from each barcode: if the barcode corresponds to a trial with k steps, we remove the k longest bars. This makes the distance less sensible to the number of steps and more sensible to the oscillations of the signal (we explain why in the mathematical description).
After having computed all the distances between pairs of barcodes, each barcode is represented as a point in the 2D Euclidean space using a dimension reduction algorithm called UMAP. This algorithm outputs a 2D point cloud whose structure is as close as possible to the structure of our set of barcodes endowed with the bottleneck distance. The obtained point cloud is a visualization of all gait signals arranged based on their topological similarity. For a given partition of the dataset, each point can be colored according to its group in order to visualize the groups on the point cloud. For each group, we compute three features: its silhouette score with respect to other groups (to measure their separability), and its mean squared distance and diameter (to measure its density).
Our method can be summarized as follows (also see Fig 4): • Input: a database of gait signals and partitions into groups.
• Construct the persistence barcode from all the gait signals.
• For each signal, count its number of steps k and remove the k longest bars from its barcode.
• Compute the bottleneck distance between all pairs of those barcodes.
• Output: the point cloud and, for each partition, the silhouette score, mean squared distance, and squared diameter for all (pairs of) groups.
Guidelines for the clinician. Here, we give guidelines on how to concretely use our method to study cohorts of patients. For users who do not want to dive into the mathematical details, all the intermediate steps of the above summary of the method can be considered to already be implemented.
• Construct a database of gait signals (preferably with information on the number of steps, if not, compute it automatically as we explain later).
• Recover the point cloud.
• Partition the cohort into groups based on additional information (clinical scales, healthy/ pathological, different sessions, right/left foot etc. . .). Each partition is defined to study a specific aspect of the cohort.
• For each partition: color points belonging to different groups in different colors, and recover tables containing all the features (silhouette scores, mean squared distances and squared diameters, or others if needed).
To interpret the results, the key idea is that the relative distances between points on the point cloud reflect the difference of structure of the oscillations of the signals.
For a given partition, if two groups are well separated on the point cloud, then the criteria that define the groups are related to the differences of gait trials. For example: if the points from the M0 session of a given patient are well separated from the M6 points, then there has been a significant change in the patient's gait and the clinician can interpret it as an evolution of the disease. On the contrary, if points of both session are not separable and form one dense group, then there is no intra/inter-session variability. If healthy subjects are well separated from patients, then the disease has an impact on gait. If points from a same session of a subject have a large mean squared distance (compared to other subjects), this means that there is a high intra-session variability. If points representing signals from the IMU placed on the left

PLOS ONE
foot of a subject are well separated from those representing right foot signals, then there is an asymmetry.
We followed those guidelines in our study, which we describe in the last sections of this article.

Mathematical description our of method
This section describes the mathematical construction of the objects from TDA used in our method.
Persistence barcodes from sublevel sets. We now explain how to perform TDA on time series using sublevel sets. For a given real-valued function f: t 7 ! f(t) and threshold a 2 R, the sublevel set F α is defined as As explained above, our goal is to study the evolution of the arrangement of data through different scales. This evolution can be summarized by a so-called persistence barcode. Formally, the persistence barcode (from sublevel sets) of a signal described by a function f is the set of pairs (date of birth, date of death) of the connected components of the sets F α as α goes from −1 to + 1. That is to say, for a given α, if F α has a connected component with no point belonging to any F β such that β < α, we say this component was born at α. If two components from F β , β < α have merged in F α then we say that the youngest one died at α.
The persistence barcode of the sublevel sets of a time series can be constructed by pairing local minima to local maxima using the following algorithm (illustrated in Fig 5): 1. Mark the level on the Y-axis of all the local extrema of the signal. The first and last points can be ignored if they are local maxima.
2. Start drawing a vertical bar going up from the global minimum.
3. Each time the bars reach the level of a another local minimum, start another vertical bar at this minimum. Then make all the bars go up to the level of the next extrema. 4. Each time the bars reach the level of a local maximum, if that point has one bar at its left and one at its right, then the shortest of those two bars stops growing. Then make all the bars go up to the level of the next extrema.
5. When the bars reach the global maximum, stop, as the remaining bar will keep growing indefinitely.
6. The persistence barcode is made of all the pairs of (start, end) vertical coordinates of the bars obtained this way (we ignore time coordinates), where the longest bar goes up to + 1.
It is usually represented horizontally as in Fig 6. This representation is obtained by keeping only the bars and Y-axis, and rotating the graph by 90˚clockwise.
Let us now describe the persistence barcode corresponding to a single gait cycle. Persistence barcodes of time series can be understood in terms of pairs of local minima and maxima. On  Fig 3, four important local extrema can be noticed: • One minimum between the heel strike and foot flat (point A on Fig 3). The angular velocity keeps decreasing for some time after the heel strike before going back to zero at foot flat.
• One maximum around zero at the plateau between foot flat and heel off (point B on Fig 3).

PLOS ONE
Topological data analysis for the study of locomotion • A second minimum just before toe off (point C on Fig 3). Heel off makes the angular velocity decrease below zero and toe off makes it go back above zero.
• A second maximum during the oscillation phase at the high plateau of the gait cycle (point D on Fig 3).
Let (t P , y P ) denote the coordinates of a point P of a time series. The barcode of the signal on Note that only one bar goes to infinity, any other bar corresponding to a pair (P, Q) has coordinates (y P , y Q ). The smaller bars are oscillations. For example, on the LF barcode, the 7 th , 8 th and 9 th bars (counting from top to bottom) correspond to the oscillation that happens just after heel strike during each gait cycle (Fig 3 shows where the heel strike happens on a signal). Those oscillations are less present on the RF signal.
Note that counting the longest bars is equivalent to counting the steps (including the last step, that may be incomplete): on Fig 6, the three long bars correspond to the three steps visible on Fig 2. Distance between barcodes. To compare persistence barcodes, a distance called the bottleneck distance can be used [23]. Recall that a barcode is a set of pairs (x, y) that are the start and end vertical coordinates of each bar. The same pair can be represented multiple times and y can be equal to +1 (this happens exactly once if the signal is defined on an interval). For barcodes B and B 0 , the bottleneck distance is based on an idea from optimal transport, using bijections between the two barcodes (functions from B to B 0 such that each bar from B 0 is associated to a unique bar from B). Let Γ(B, B 0 ) be the set of bijections from B to B 0 . Note that if two finite sets do not have the same number of elements there are no bijections between them, so we include to B and B 0 all bars (x, x) of length zero (an infinite number of times), so that there always exists a bijection between B and B 0 . For any γ 2 Γ(B, B 0 ) and any bar b = (x, y) 2 B such that γ((x, y)) = b 0 = (x 0 , y 0 ), the two bars can be compared using the infinite norm: For each γ, the pair of bars (b, b 0 ) such that kb − γ(b)k 1 is maximal gives a notion of similarity between B and B 0 induced by the pairing of bars defined by γ. The bottleneck distance is defined by choosing the bijection that minimizes this quantity (which means that we associate each bar of B to the one in B 0 that is the most similar). Formally, the bottleneck distance between B and B 0 is given by:

PLOS ONE
Topological data analysis for the study of locomotion Stability theorems [44][45][46][47] prove that under generic assumptions, barcodes associated with similar signals are close for the bottleneck distance.
As explained above, the number of long bars in barcodes from gait signals is the number of steps. This implies that the bottleneck distance between two barcodes corresponding to trials with a different number of steps will be high. Indeed, in that case, one of the two barcodes will have more long bars than the other so each bijection γ will pair at least one long bar b to a short bar γ(b), resulting in a high kb − γ(b)k 1 and thus in a high distance.
This means that the bottleneck distance will mainly distinguish signals that have a different number of steps. However a different number of steps can be due to many factors such as experimental conditions, the subject's height, age, or the foot that does the first step (a RF and a LF signal from the same exercise can have a different number of steps if a subject starts and ends with the same foot). To reduce this step-counting effect and focus more on oscillations, we propose to count the steps on each signal and remove the k longest bars from the corresponding barcode, where k is the number of steps. The number of steps can be computed from signals using the autocorrelation function (ACF) of each signal. The time when the second peak of the ACF is reached is the duration of the first gait cycle, and the number of steps can be deduced from this quantity and the duration of the trial. This method is heuristic and has limitations, notably with signals from patients with very deteriorated gait. For any future clinical use of our method, steps could be counted during the protocol and included in the data.
Once the barcodes from every gait signal of the database have been computed, the next step of our method is the following: for each pair of barcodes B and B 0 corresponding to trials with respectively k and k 0 steps, remove the k (resp. k 0 ) longest bars from B (resp. B 0 ) to get a new barcodeB (resp.B 0 ) and compute d Bot ðB;B 0 Þ. Visualization algorithm. Barcodes (and the gait signals they represent) can be seen as points in a (non-Euclidean) metric space, endowed with the bottleneck distance. To be visualized, these points need to be projected onto the 2D or 3D Euclidean space. Several algorithms can compute such a projection, including the UMAP algorithm [48], t-SNE [49] and multidimensional scaling (MDS) [50]. MDS focuses on respecting the distance matrix, while UMAP and t-SNE intend to represent the structure of the original metric space. We chose UMAP to focus on structure, and because its parameters can be chosen so that more global structure is preserved than with t-SNE.
The UMAP algorithm has been used in applications such as the study of odors and molecular structures [51], physical and genetic interactions [52] or genomic data [53].
The UMAP algorithm takes as input a distance matrix (here, the matrix of all bottleneck distances between all pairs of barcodes) and two parameters: n_neighbors and min_dist. It outputs a 2D (or 3D) point cloud where each point represents a gait signal, whose structure induced by the Euclidean distance is as close as possible to the structure induced by the bottleneck distance on the space of barcodes. That is to say, if two barcodes are close according to the bottleneck distance, then the corresponding points in the point cloud will be close according to the Euclidean distance.
The two parameters control the compromise between respecting the local and global structure of the data. A low n_neighbors parameter makes the UMAP algorithm focus more on the local structure around each point, whereas a high n_neighbors will make it focus on the global structure. The min_dist parameter is the minimum distance allowed between two points in the point cloud. A low min_dist allows the algorithm to represent similar barcodes as close points in the point cloud. A high min_dist will prevent it to produce very dense neighborhoods to make the global structure appear more clearly.
In what follows, we use UMAP with the metric induced by the bottleneck distance, with parameters n_neighbors = 45, min_dist = 0.3 and in 2D. See Fig 7 for an example. Interactive versions of the plots are provided with this article.

Feature extraction
The goal of our method is to study a database of gait signals. As the UMAP projection preserves the structure induced by the bottleneck distance between barcodes, information can be extracted by studying the relative positions and neighborhood relations of points in the point cloud. To do this, we regroup signals that share a given characteristic (for example: both being from a healthy/pathological subject, or from the same session) and study the geometry of the groups and the relations between groups, using three features: the silhouette score (Sil), the mean squared intra-group distance (MSD) and the squared diameter (SD).
Note that, when using a UMAP projection, no information can be extracted from the absolute coordinates of the points or the absolute distance between two points. The position of a point should only be studied through its distance to other points, and distances should be studied in a relative way. For example, saying that point A and point B are closer together than they are with point C means that signal A is more similar to signal B than to signal C.
Let S = (s i ) 1�i�n be a database of n gait signals and X = (x i ) 1�i�n be a (2D or 3D) point cloud obtained with the above method, where each point x i represents a signal s i . Let (C i ) i2I be a partition of X into groups.
Silhouette score. Let i, j be two distinct indices in I, |C| denote the cardinal of set C and k.k 2 denote the Euclidean norm. The silhouette score of a point x 2 C i , with respect to group j is defined as: where a ¼ 1 jC i jÀ 1 P y2C i ;y6 ¼x kx À yk 2 is the mean distance between x and all other points in the same group, and b ¼ 1 y2C j kx À yk 2 is the mean distance between x and all points in group j.

PLOS ONE
Topological data analysis for the study of locomotion The silhouette score of group i with respect to group j is defined as the mean silhouette score of the points of group i with respect to group j: Mean squared distance. The MSD of group i is defined as: Squared diameter. The squared diameter of group C i is defined as: Note that we have squared the distance to be consistent with the MSD.
Interpretation of the features. The silhouette score is a clustering evaluation metric that is used to determine if the groups we define can be considered as clusters in our point cloud. It takes values between -1 and 1. A value close to 1 means good clustering (the groups are well separated from one another and dense), a value close to 0 means overlapping groups, and a value close to -1 means bad clustering.
To understand this, let us consider concrete examples. For a point x 2 C 1 , Sil(x, C 2 ) = 0.5 means that b = 2a (using the same definition as above for a and b), i.e. that x is on average twice as far from points of C 2 than from points of C 1 . Thus, Sil(C 1 , C 2 ) = 0.5 means that on average a point of C 1 will be twice as far from C 2 than from C 1 . On the contrary, Sil(x, C 2 ) < 0 means that x is on average closer to C 2 than to C 1 , and having x in C 2 would increase the score. Negative scores are thus interpreted as bad clustering. If Sil(x, C 2 ) is close to zero, then the difference between a and b is small compared to the size of the groups, so x can be considered to be as close to C 1 and C 2 , i.e. x is "between C 1 and C 2 ". A small Sil(C 1 , C 2 ) then means that the two groups are overlapping.
The fact that values are always between -1 and 1 is an advantage of the silhouette score compared to other clustering evaluation metrics because it can be interpreted on its own without necessarily being compared to the score of a different clustering. Note that the silhouette score is not symmetrical: Sil(C i , C j ) is not necessarily equal to Sil(C j , C i ).
The MSD and SD measure the density of the groups. They should only be interpreted relatively to other groups. The MSD measures the average (squared) distance between points of the same group, so a smaller MSD means that a group has points that are closer to one another on average. The SD measures the largest of those distances. The SD is complementary to the MSD because it focuses on the two points that are the furthest apart. For example, a group C 1 of points uniformly spread on a line and a group C 2 with one point at the beginning of the line and all the other points at the end would have the same SD but C 2 would have a significantly lower MSD as it is very dense except for one outlier. A joint analysis of the MSD and SD can thus detect outliers in a group with relatively low MSD and high SD.

Results
We applied our method to study the database of gait signals described above. The study is divided in three parts: the first one compares healthy subjects (HS) to multiple sclerosis (MS) patients, the second one is a series of experiments that compare subjects with different EDSS scores, and the third one studies the evolution of each subject between M0 and M6. Each experiment corresponds to a different partition of the database.

HS/MS experiment
Here, we divide our database into two groups: HS and MS. HS is the group of points from healthy subjects and MS is the group of points from multiple sclerosis patients.
This partition can be visualized on

EDSS experiments
In this section, we perform a series of experiments to study the relation between the EDSS and the relative position of points on the point cloud. For a given threshold i, we divide our database into two groups: {EDSS � i} and {EDSS > i}. {EDSS � i} is the group of points from subjects with EDSS lower than or equal to i and {EDSS > i} is the group of signals from subjects with EDSS strictly higher than i. HS are given an EDSS of 0. We consider the following values of i: 0, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, and 6 as there are no patients with EDSS under 2 or above 6.5. Note that the experiment with i = 0 is the HS/MS experiment.
The EDSS corresponding to each point can be visualized on Fig 8. Table 3 shows the Sil, MSD and SD values for each partition ({EDSS � i}, {EDSS > i}). In this table, for the sake of clarity, we use the notation Sil(� i, > i) instead of Sil({EDSS � i}, {EDSS > i}).

PLOS ONE
All silhouette scores are positive. For each i, at least one of the two silhouette scores Sil Result using walking velocity instead of TDA. Fig 9 shows the point cloud obtained by performing the same experiment except that the bottleneck distance was replaced by the difference of walking velocity (in m/s) between trials. Points are colored according to the EDSS of the corresponding subject.

Longitudinal experiment
In this section, for each subject, we compare the M0 session to the M6 session. We start by dividing our database into 32 groups (each corresponding to one subject): each subject is given an ID between 1 and 32, and group i corresponds to signals from subject i. Then, each group i is subdivided into two groups M0 i and M6 i . M0 i is the group of signals from the M0 session of subject i, M6 i is the group of signals from their M6 session. The final partition has 64 groups: Table 3. Features for points with EDSS lower or equal to/ stricly higher than each threshold. i

Sil(� i, > i) Sil(> i, � i) MSD({EDSS � i}) MSD({EDSS > i}) SD({EDSS � i}) SD({EDSS > i})
The goal of this experiment is to study the evolution of each patient, therefore silhouette scores are only computed for pairs (M0 i , M6 i ). The partition into 32 groups and the partition into 64 groups can be visualized on the interactive plots provided with this paper (along with those corresponding to Figs 7 and 8). Table 4 shows the Sil, MSD and SD values for each pair (M0 i , M6 i ).
Study of subject 12. The two highest silhouette scores are from subject 12: Sil(M0 12 , M6 12 ) = 0.83 and Sil(M6 12 , M0 12 ) = 0.96. Subject 12 has an EDSS of 4 at M0 and 5 at M6 and is the only MS patient to have a variation of their EDSS of more than 0.5.

PLOS ONE
Topological data analysis for the study of locomotion Study of subject 2. The group M0 2 of points from the M0 session of subject 2 has the highest MSD and SD of Table 4: MSD(M0 2 ) = 24.9 and SD(M0 2 ) = 99.9. The second highest MSD is about 13 and the second highest SD is about 37.
On Fig 7, the blue point which is the furthest on the right is from the M0 session of subject 2. (Table 2) quantify what could be observed on Fig 7: the HS and MS groups form clusters with high silhouette scores (over 0.4), but there is some overlap. The HS group is denser that the MS group, which can be explained by the fact that the disease is more severe for some patients than others.

HS/MS and EDSS experiments. The features from the HS/MS experiments
In Fig 8, a global continuity of the color of points can be observed from left (dark blue, low EDSS) to right (dark red, high EDSS). The goal of studying all partitions ({EDSS � i}, {EDSS > i}) is to quantify this left/right continuity. The silhouette scores on Table 3 show that {EDSS � i} and {EDSS > i} almost always form satisfactory clusters, except for {EDSS � i} when i is above 5 (in that case, the group is sparse and patients with EDSS above 4 are far from the HS). This shows that our method reflects the global progression of the disease by placing MS patients with a low EDSS closer to HS than to patients with a high EDSS.
Longitudinal experiment. The objective of this experiment is to study each subject independently from the others to compare their M0 and M6 sessions. The idea behind our approach is that gait signals cannot be compared to an absolute reference but an evolution can be detected by comparing a subject at M6 to himself at M0, thus taking M0 as the reference. Analyzing the signals from subjects with significant values in Table 4 allowed us to highlight three different phenomena: • A significant change in subject 12's gait between M0 and M6, which we deduce from the fact that the groups M0 12 and M6 12 are almost completely separable (see Fig 10). This evolution

PLOS ONE
Topological data analysis for the study of locomotion of gait can be linked to the significant evolution of the patient's disease, as their EDSS goes from 4 to 5 (and is the only one that has a variation of more than 0.5).
• An asymmetrical gait at M6 for subject 15. Usually, the M0 and M6 silhouette scores of a given patient are close because if M0 is separable from M6 then M6 should be easily separable from M0. For subject 15, those scores are Sil(M0 12 , M6 12 ) = 0.49 and Sil(M6 12 , M0 12 ) = −0.21. This can be explained by the fact that the M6 group is sparser (its MSD and SD are among the highest of all sessions). The M6 group is split in two parts of four points each (see Fig 10). One part (the one closer to the M0 group) is made of the four RF signals of the M6 session, and the other one is made of the LF signals. The significant values for subject 15 on Table 4 can thus be explained by the apparition of an asymmetry at M6. This explanation is supported by clinical evidence.
• An outlier and a technical issue in the signal's acquisition. Subject 2's M0 group has a MSD and a SD significantly higher than every other group. It is due to the outlier of the HS group (the blue point on the right of Fig 7), which belongs to M0 2 . Visualizing this outlier allowed us, by going back to the associated gait trial, to highlight a segmentation problem during the construction of the database. Note that the MSD detects the outlier because the remaining points of the M0 2 group have a low density. If it had been denser the MSD would have been lower but the SD would be similar. This justifies using the squared diameter as a complementary density measure. Asymmetry can be detected this way in other patients such as for subject 21, and even, at a lower scale, in the HS group such as for subject 6.  Table 4.

Comparison to state of the art
The method presented in this article has the advantage of being non-parametric, except for the two UMAP parameters, for which the default values suggested by the authors of [48] seem to be appropriate. Moreover, it only relies on raw gait signals and does not need additional information such as step annotations or step detection (although, as mentioned before, knowing the number of step for each signal can be useful). Thus, it can be applied to any type of gaitaffecting pathology without having to choose new parameters or previously perform step detection or manual annotations.
Our study of multiple sclerosis shows that the method can identify a global correlation between the severity of the disease represented by the EDSS and distance to HS points, and also detects changes in the patients' gait. Fig 9 shows a point cloud made of dense groups of points that are completely separable from each other. Several groups are made of points with different EDSS scores, and there does not appear to be any way to correlate the distribution of points with their EDSS. This can be explained by the fact that velocity is significantly impacted by other factors than the disease and thus two subjects with different clinical conditions can have the same velocity. In particular, this means that, compared to our approach, the difference of velocity could not be used to detect clinical evolution. Using other standard gait features such as step length, step time (or its variation coefficient) or double stance time (or its variation coefficient) gives similar results to those obtained with velocity.
The use of TDA with sublevel sets to create persistence barcodes and compute distances between them naturally allows to compare signals from the left foot to signals from the right foot, as those barcodes are invariant by translation along the time axis. Allowing comparison between different feet doubles the number of points for each session and thus makes the following analysis more relevant. Moreover, it provides a way to detect asymmetry in a subject's gait which can further assist clinical gait evaluation. For subjects who have a strong asymmetry, it may be necessary to separate RF and LF signals to study their evolution as the effect of an asymmetry on the distance between points may dominate the effect of any other phenomenon.

Limitations
This work was limited by the size of the database, that only contains signals from 22 MS patients, and some EDSS values are not represented (4.5 or values under 2). Because of those missing values, we could not study the impact of MS on gait at its earliest stages. Indeed, our study of the ({EDSS � i} and {EDSS > i}) groups would allow us to quantify how close patients with low EDSS are to healthy subjects and refine our analysis of the progression of the disease. Having more than two sessions per subject would also be beneficial for the longitudinal study.
The second limitation is the access to a ground truth. We used the EDSS as a measure of the severity of the patients' disease, but it is limited by its lack of objectivity and of sensitivity to change (and so are other clinical scores for MS) [7][8][9][10][11][12]. Indeed, in the studied cohort, EDSS scores do not vary by more than 0.5 between M0 and M6 in all cases except one, and often stays the same between two sessions whereas for several patients our method clearly separates the M0 and M6 points. A different ground truth thus seems necessary to compare the results of our method to the conclusions obtained with clinical scores.

Perspectives
More work may be done to test our method on more patients to study MS or other pathologies including some that involve a left/right asymmetry.
The method can be generalized to analyze different physiological signals or any type of time series, as the only step that is specific to the study of locomotion is the one when bars are removed from persistence barcodes according to the number of steps. Future work may also include using different TDA techniques to improve our method for gait signals or to apply it to other types of signals. An example of such a technique, that is widely used in the literature on TDA for time series (including [28,29,35,36,38,39]), is the delay embedding, which is a way of transforming a time series into a multi-dimensional point cloud. Using a delay embedding to represent a time series as a d-dimensional point cloud allows to study its persistent homology in dimension 0 to d − 1 (one persistence barcode can be computed for each dimension), and different dimensions may contain complementary information. A similar approach could also be used to deal with multivariate data. One of the challenges of using a delay embedding is that it makes the method more parametric (it introduces at least two parameters: the dimension of the embedding and the delay) and more difficult to interpret than when using sublevel sets (in which case barcodes can be interpreted in terms of oscillations, as explained above).

Conclusion
This article has two main contributions: a non-parametric method to study gait signals and visualize the results, and an application to study multiple sclerosis both globally and in a longitudinal way. Our method is based on techniques from topological data analysis, which relies on algebraic topology. Our goal was to present the method in a way that requires no background in topological data analysis to insist on the ideas behind it and make it more easily usable by clinicians.